A bit review ANOVA & ANCOVA in the Frequentist’s view
“ANOVA & ANCOVA” in Bayesian Context
Contrast Comparison
If we pay a bit attention to the ANOVA and ANCOVA, we can see the interesting point of views in these cases:
This type of analysis actually is a model with metric output and one nominal predictor.
We represent the nominal predictor by a vector \(\vec{x}\) = {\(x_{1}, ..., x_{J}\)}, where \(J\) is the number of categories that the predictor has. When an individual falls in group \(j\) of the nominal predictor, this is represented by setting \(x_{j} = 1\) and \(x_{i\neq j} = 0\). The predicted value, denoted \(\mu\), is the overall baseline plus the deflection of a group:
\[\begin{align*} \mu &= \beta_0 + \sum_j \beta_{j} x_{j} \\ &= \beta_0 + \vec{\beta} \cdot \vec{x} \end{align*}\]
The overall baseline is constrained so that the deflections sum to
\(0\) across the categories:
\[ \sum_j \beta_{[j]} = 0 \]
In order to satisfy this constraint, first, find unconstrained slopes \(\alpha_1, ..., \alpha_J\) and then adjust them using formulas:
\[\begin{align*} \mu &= \alpha_0 + \sum_j \alpha_{j} x_{j} \\ &= (\alpha_0 + \bar{\alpha}) + \sum_j (\alpha_{j} - \bar{\alpha}) \cdot x_{j}\\ &= \beta_0 + \sum_j \beta_{j} x_{j} \end{align*}\]
where \(\bar{\alpha} = \frac{1}{n} \sum_j \alpha_j\)
On the diagram the within-group standard deviation \(\sigma\) is given a broad uniform prior, the intercept is given a broad normal prior centered at \(0\) and the group coefficients come from zero-centered normal distribution.
Interpretation of intercept is the grand mean value.
The standard deviation of slopes \(\sigma_{\beta}\) may be either a constant or may have a prior of its own.
A key assumption behind estimating \(\sigma_{\beta}\) from the data is that all groups can be meaningfully described as representatives of shared higher-level distribution.
Example. Giving \(\sigma_{\beta}\) a prior distribution is not justified if, for example, in medical studies there are several control groups and one treated group. In this case small difference between placebo groups may result in significant shrinkage (bias) of the treated group.
In this example the treated group is an outlier. Heavy-tailed prior distribution for \(\beta_j\) may help to overcome excessive shrinkage.
Possible priors for \(\sigma_{\alpha}\):
Another interpretation of imploding shrinkage is: the model has to
make choice between assigning more variability to between the groups
(\(\sigma_{\alpha}\)) or shrinking the
groups and assigning more variability to within the groups noise (\(\sigma_{y}\)).
Whenever it is possible the model will prefer the latter.
Shrinkage will not be very efficient in case of binary predictor.
The fruit fly, Drosophila melanogaster, is known for this species that newly inseminated females will not remate (for at least two days), and males will not actively court pregnant females. There were 25 male fruit flies in each of the five groups.
CompanionNumber
: group
None0
: males with zero companionsPregnant1
: males with one pregnant femalePregnant8
: males accompanied by eight pregnant
femalesVirgin1
: males accompanied by one virgin femaleVirgin8
: males accompanied by eight virgin femalesResearch Question: estimate the life spans and the magnitude of differences between groups.
#read the data, the file is available at [K].
<- read.csv("data/Fruitfly.csv")
dta names(dta)
[1] "Longevity" "CompanionNumber" "Thorax"
datatable(dta)
library(tidyverse)
<-group_by(dta, CompanionNumber) %>%
grMeanssummarise(
count = n(),
mean = mean(Longevity, na.rm = TRUE),
sd = sd(Longevity, na.rm = TRUE)
)levels(as.factor(dta$CompanionNumber))
[1] "None0" "Pregnant1" "Pregnant8" "Virgin1" "Virgin8"
grMeans
# A tibble: 5 × 4
CompanionNumber count mean sd
<chr> <int> <dbl> <dbl>
1 None0 25 63.6 16.5
2 Pregnant1 25 64.8 15.7
3 Pregnant8 25 63.4 14.5
4 Virgin1 25 56.8 14.9
5 Virgin8 25 38.7 12.1
# Box plots
# ++++++++++++++++++++
# Plot weight by group and color by group
library("ggpubr")
ggboxplot(dta, x = "CompanionNumber", y = "Longevity",
color = "CompanionNumber",
order = c("None0", "Pregnant1", "Pregnant8", "Virgin1", "Virgin8"),
ylab = "Longevity", xlab = "Companion Number")
ggline(dta, x = "CompanionNumber", y = "Longevity",
add = c("mean_se", "jitter"),
order = c("None0", "Pregnant1", "Pregnant8", "Virgin1", "Virgin8"),
ylab = "Longevity", xlab = "Companion Number")
# Compute the analysis of variance
<- aov(Longevity ~ CompanionNumber, data = dta)
longevity.aov # Summary of the analysis
summary(longevity.aov)
Df Sum Sq Mean Sq F value Pr(>F)
CompanionNumber 4 11939 2984.8 13.61 3.52e-09 ***
Residuals 120 26314 219.3
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Interpret the result of one-way ANOVA tests:
As the p-value is less than the significance level \(0.05\), we can conclude that there are significant differences between the groups highlighted with “***” in the model summary.
Tukey multiple pairwise-comparisons
As the ANOVA test is significant, we can compute Tukey HSD (Tukey Honest Significant Differences, R function: TukeyHSD()) for performing multiple pairwise-comparison between the means of groups.
The function TukeyHD() takes the fitted ANOVA as an argument.
TukeyHSD(longevity.aov)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = Longevity ~ CompanionNumber, data = dta)
$CompanionNumber
diff lwr upr p adj
Pregnant1-None0 1.24 -10.36047 12.840468 0.9983034
Pregnant8-None0 -0.20 -11.80047 11.400468 0.9999988
Virgin1-None0 -6.80 -18.40047 4.800468 0.4854206
Virgin8-None0 -24.84 -36.44047 -13.239532 0.0000003
Pregnant8-Pregnant1 -1.44 -13.04047 10.160468 0.9969591
Virgin1-Pregnant1 -8.04 -19.64047 3.560468 0.3126549
Virgin8-Pregnant1 -26.08 -37.68047 -14.479532 0.0000001
Virgin1-Pregnant8 -6.60 -18.20047 5.000468 0.5157692
Virgin8-Pregnant8 -24.64 -36.24047 -13.039532 0.0000004
Virgin8-Virgin1 -18.04 -29.64047 -6.439532 0.0003240
Check ANOVA assumptions: test validity?
# 1. Homogeneity of variances
plot(longevity.aov, 1)
# 2. Normality
plot(longevity.aov, 2)
# Extract the residuals
<- residuals(object = longevity.aov )
aov_residuals # Run Shapiro-Wilk test
shapiro.test(x = aov_residuals ) #Shapiro-Wilk test on the ANOVA residuals (W = 0.99, p = 0.4) which finds no indication that normality is violated.
Shapiro-Wilk normality test
data: aov_residuals
W = 0.98887, p-value = 0.4088
A Shapiro-Wilk test on the ANOVA residuals (W = 0.99, p = 0.4) which finds no indication that normality is violated.
<- list(Ntotal = nrow(dta),
dataList y = dta$Longevity,
x = as.integer(as.factor(dta$CompanionNumber)),
NxLvl = nlevels(as.factor(dta$CompanionNumber)),
agammaShRa=unlist(gammaShRaFromModeSD(mode = sd(dta$Longevity)/2,
sd = 2*sd(dta$Longevity))))
Function gammaShRaFromModeSD()
from the book calculates
shape and rate parameters of gamma distribution from mode and standard
deviation.
#view source code of function gammaShRaFromModeSD()
gammaShRaFromModeSD
function (mode, sd)
{
if (mode <= 0)
stop("mode must be > 0")
if (sd <= 0)
stop("sd must be > 0")
rate = (mode + sqrt(mode^2 + 4 * sd^2))/(2 * sd^2)
shape = 1 + mode * rate
return(list(shape = shape, rate = rate))
}
<-gammaShRaFromModeSD(mode=sd(dta$Longevity)/2,sd=2*sd(dta$Longevity))) (ShRa
$shape
[1] 1.283196
$rate
[1] 0.03224747
With these parameters standard deviation of slopes is concentrated around a relatively small value 8.7819463, but has fat enough right tail.
<- seq(from=0.001, to=100, by=1 )
xAxis plot(xAxis, dgamma(xAxis, shape = ShRa$shape, rate = ShRa$rate), type='l',
ylab="Gamma Density",xlab="Sigma_Alpha", main="Prior for Sigma_Alpha: mode=8.78")
abline(v=sd(dta$Longevity)/2)
Prepare the model description.
<-"
modelStringdata {
int<lower=1> Ntotal;
real y[Ntotal];
int<lower=2> NxLvl;
int<lower=1, upper=NxLvl> x[Ntotal];
real<lower=0> agammaShRa[2];
}
transformed data {
real meanY;
real sdY;
meanY = mean(y);
sdY = sd(y);
}
parameters {
real a0;
real<lower=0> aSigma;
vector[NxLvl] a;
real<lower=0> ySigma;
}
model {
a0 ~ normal(meanY, 5*sdY);
aSigma ~ gamma(agammaShRa[1], agammaShRa[2]);
a ~ normal(0, aSigma);
ySigma ~ uniform(sdY/100, sdY*10);
for ( i in 1:Ntotal ) {
y[i] ~ normal(a0 + a[x[i]], ySigma);
}
}
generated quantities {
// Convert a0,a[] to sum-to-zero b0,b[] :
real b0;
vector[NxLvl] b;
b0 = a0 + mean(a);
b = a - mean(a);
}"
Run MCMC.
<- stan_model(model_code = modelString) stanDso
If saved DSO is used load it, then run the chains.
# saveRDS(stanDso,file="data/stanDSO.Rds")
<- readRDS("data/stanDSO.Rds") stanDso
<- sampling(stanDso,
fit data = dataList,
pars = c('b0', 'b', 'aSigma', 'ySigma'),
iter = 5000, chains = 2, cores = 4)
Analyze the chains using shinystan()
launch_shinystan(fit)
summary(fit)$summary[,c(1,4,6,8,10)]
mean 2.5% 50% 97.5% Rhat
b0 57.4393525 54.8635804 57.4343609 60.01657 1.0001866
b[1] 5.7588760 0.5134606 5.7673484 10.97372 0.9999060
b[2] 6.9174882 1.8407142 6.8867440 12.16411 0.9999481
b[3] 5.5581312 0.4278215 5.5290942 10.80727 0.9997391
b[4] -0.5954606 -5.7719881 -0.5929975 4.69568 1.0000531
b[5] -17.6390348 -22.7535663 -17.6253508 -12.37578 1.0002923
aSigma 15.4223486 6.3137021 13.3207333 37.62849 1.0021829
ySigma 14.9461756 13.2475061 14.9014775 16.95289 1.0005380
lp__ -409.5860642 -414.8603961 -409.2042069 -406.39237 1.0017972
cbind(GroupMeans=grMeans,
EstimatedMeans=summary(fit)$summary[2:6,1]+summary(fit)$summary[1,1])
GroupMeans.CompanionNumber GroupMeans.count GroupMeans.mean
b[1] None0 25 63.56
b[2] Pregnant1 25 64.80
b[3] Pregnant8 25 63.36
b[4] Virgin1 25 56.76
b[5] Virgin8 25 38.72
GroupMeans.sd EstimatedMeans
b[1] 16.45215 63.19823
b[2] 15.65248 64.35684
b[3] 14.53983 62.99748
b[4] 14.92838 56.84389
b[5] 12.10207 39.80032
Note differences between estimated group means and also shrinkage.
stan_ac(fit, separate_chains = T)
pairs(fit)
plot(fit)
plot(fit,pars=c("b"))
stan_dens(fit)
From MCMC we see immediately that not all parameters \(\beta_j\) are zeros (omnibus test).
As usual, the following question in ANOVA is: which group means are not
equal.
To answer this question we use contrasts.
In FNP approach there is a constraint on how many pairs or subgroups can be checked with contrasts with desired accuracy of inference.
In Bayesian approach testing of contrasts is straight forward and does not have restrictions: each step of chain returns a combination of group means. Thus we only need to compare frequency of these combinations in the chain.
Calculate contrasts for betas
.
Extract chains for parameters b
.
<- rstan::extract(fit)
fit_ext head(fit_ext$b)
iterations [,1] [,2] [,3] [,4] [,5]
[1,] 7.477280 3.4507237 6.191515 -4.0575910 -13.06193
[2,] 3.379211 12.8289133 5.852268 0.1562871 -22.21668
[3,] 7.777005 0.5933093 6.875631 0.7496060 -15.99555
[4,] 6.199593 4.9381578 5.802773 2.4810871 -19.42161
[5,] 8.107023 4.9390397 6.255589 -1.3545609 -17.94709
[6,] 5.496331 7.7539725 9.307447 -5.7502948 -16.80746
dim(fit_ext$b)
[1] 5000 5
<- fit_ext$b[,1] - fit_ext$b[,5]
contrast_1_5 plot(contrast_1_5)
hist(contrast_1_5)
<- hdi(contrast_1_5)) (hdiContrast_1_5
lower upper
15.21982 31.95605
attr(,"credMass")
[1] 0.95
<- sd(contrast_1_5)) (sd.contrast_1_5
[1] 4.256989
plot(rank(fit_ext$b[,1]), rank(fit_ext$b[,5]))
<-(fit_ext$b[,1] + fit_ext$b[,2] + fit_ext$b[,3])/3
Comb1<-(fit_ext$b[,4] + fit_ext$b[,5])/2
Comb2<- Comb1 - Comb2
contrast_123_45 plot(contrast_123_45)
hist(contrast_123_45)
<-hdi(contrast_123_45)) (hdiContrast_123_45
lower upper
9.713508 20.416612
attr(,"credMass")
[1] 0.95
<-sd(contrast_123_45)) (sd.contrast_123_45
[1] 2.734301
head(cbind(Comb1,Comb2))
Comb1 Comb2
[1,] 5.706506 -8.559759
[2,] 7.353464 -11.030196
[3,] 5.081982 -7.622973
[4,] 5.646841 -8.470262
[5,] 6.433884 -9.650826
[6,] 7.519250 -11.278875
plot(Comb1[1:100],Comb2[1:100])
plot(rank((fit_ext$b[,1] + fit_ext$b[,2] + fit_ext$b[,3])/3),rank((fit_ext$b[,4] + fit_ext$b[,5])/2))
Define the model
<-"
modelStringdata {
int<lower=1> Ntotal;
vector[Ntotal] y;
int<lower=2> Nx1Lvl;
int<lower=2> Nx2Lvl;
int<lower=1, upper=Nx1Lvl> x1[Ntotal];
int<lower=1, upper=Nx2Lvl> x2[Ntotal];
real<lower=0> agammaShRa[2];
}
transformed data {
real meanY;
real sdY;
vector[Ntotal] zy;
meanY = mean(y);
sdY = sd(y);
zy = (y - mean(y)) / sdY; // center & normalize
}
parameters {
real a0;
real<lower=0> a1Sigma;
real<lower=0> a2Sigma;
real<lower=0> a1a2Sigma;
vector[Nx1Lvl] a1;
vector[Nx2Lvl] a2;
matrix[Nx1Lvl,Nx2Lvl] a1a2;
real<lower=0> zySigma;
}
model {
a0 ~ normal(0, 1);
a1Sigma ~ gamma(agammaShRa[1], agammaShRa[2]);
a1 ~ normal(0, a1Sigma);
a2Sigma ~ gamma(agammaShRa[1], agammaShRa[2]);
a2 ~ normal(0, a2Sigma);
a1a2Sigma ~ gamma(agammaShRa[1], agammaShRa[2]);
for (j1 in 1:Nx1Lvl) {
a1a2[j1,] ~ normal(0, a1a2Sigma);
}
zySigma ~ uniform(1.0/10, 10);
for ( i in 1:Ntotal ) {
zy[i] ~ normal(a0 + a1[x1[i]] + a2[x2[i]]+ a1a2[x1[i],x2[i]], zySigma);
}
}
generated quantities {
// Convert a to sum-to-zero b :
real b0;
vector[Nx1Lvl] b1;
vector[Nx2Lvl] b2;
matrix[Nx1Lvl,Nx2Lvl] b1b2;
matrix[Nx1Lvl,Nx2Lvl] m;
real<lower=0> b1Sigma;
real<lower=0> b2Sigma;
real<lower=0> b1b2Sigma;
real<lower=0> ySigma;
for ( j1 in 1:Nx1Lvl ) { for ( j2 in 1:Nx2Lvl ) {
m[j1,j2] = a0 + a1[j1] + a2[j2] + a1a2[j1,j2]; // cell means
} }
b0 = mean(m);
for ( j1 in 1:Nx1Lvl ) { b1[j1] = mean( m[j1,] ) - b0; }
for ( j2 in 1:Nx2Lvl ) { b2[j2] = mean( m[,j2] ) - b0; }
for ( j1 in 1:Nx1Lvl ) { for ( j2 in 1:Nx2Lvl ) {
b1b2[j1,j2] = m[j1,j2] - ( b0 + b1[j1] + b2[j2] );
} }
// transform to original scale:
b0 = meanY + sdY * b0;
b1 = sdY * b1;
b2 = sdY * b2;
b1b2 = sdY * b1b2;
b1Sigma = sdY * a1Sigma;
b2Sigma = sdY * a2Sigma;
b1b2Sigma = sdY * a1a2Sigma;
ySigma = sdY * zySigma;
}"
#Create DSO.
<- stan_model(model_code=modelString) stanDsoANOVA2Way
#If saved DSO is used load it, then run the chains.
# saveRDS(stanDsoANOVA2Way,file="data/stanDso2WayANOVA.Rds")
<- readRDS(file="data/stanDso2WayANOVA.Rds") stanDsoANOVA2Way
Read the data.
The data show faculty salaries of a university.
The columns selected as predictors are:
Position (Assistant Professor, Associate Professor, Full Professor, Full Professor with endowment salary and Distinguished Professor) Department (total 60 departments)
# 20_1: Metric Predicted Variable with Two Nominal Predictors
# load data from 'Salary.csv' (see Kruschke)
<- read.csv("data/Salary.csv")
df mean(df$Salary)
[1] 110452
tail(df)
Org OrgName Cla Pos ClaPos
1075 OADT Business Operations & Decision Technolog PT FT3 PT.FT3
1076 CSCI Computer Science PT FT3 PT.FT3
1077 PHYS Physics PT FT3 PT.FT3
1078 GEOL Geological Sciences PR FT1 PR.FT1
1079 BI Biology PR FT1 PR.FT1
1080 MUST Music School-String Technology PR FT1 PR.FT1
Salary
1075 147400
1076 92500
1077 76748
1078 105786
1079 116592
1080 138022
dim(df)
[1] 1080 6
names(df)
[1] "Org" "OrgName" "Cla" "Pos" "ClaPos" "Salary"
with(df, table(Pos))
Pos
DST FT1 FT2 FT3 NDW
32 360 364 240 84
with(df, table(Org))
Org
ACTG AFRO AMST ANTH APHS AST BEPP BFIN BI BLAN CEDP CEUS CHEM CMCL
19 7 7 23 18 7 9 21 51 7 25 11 35 17
CMLT CRIN CSCI EALC ECON ELPS ENG FINH FINS FOLK FRIT GEOG GEOL GERM
8 23 30 14 20 16 39 9 14 11 14 8 19 9
HIST INFO JOUR KINE LAWS LGED LING MATH MGMT MKTG MUHI MUIN MUST MUTH
35 20 14 15 37 11 8 40 12 13 9 38 14 8
MUVO OADT OPT PHYS PL POLS PSY REL RPAD SLIS SLS SOC SPAN SPEA
10 14 11 32 14 22 41 13 10 13 7 20 21 40
SPHS STAT TELC THTR
15 7 12 13
length(table(df$Org))
[1] 60
<- df$Salary
y <- as.factor(df$Pos)
x1 <- as.factor(df$Org)
x2 str(df)
'data.frame': 1080 obs. of 6 variables:
$ Org : chr "PL" "MUTH" "ENG" "CMLT" ...
$ OrgName: chr "Philosophy" "Music Theory" "English" "Comparative Literature" ...
$ Cla : chr "PC" "PC" "PC" "PC" ...
$ Pos : chr "FT2" "FT2" "FT2" "FT2" ...
$ ClaPos : chr "PC.FT2" "PC.FT2" "PC.FT2" "PC.FT2" ...
$ Salary : int 72395 61017 82370 68805 63796 219600 98814 107745 114275 173302 ...
<- list(Ntotal=length(y),
dataListSalary y=y,
x1=as.integer(x1),
x2=as.integer(x2),
Nx1Lvl=nlevels(x1),
Nx2Lvl=nlevels(x2),
agammaShRa=unlist(gammaShRaFromModeSD(mode=1/2, sd=2)))
Create names of variables and their interactions
<-names(table(df$Pos))) (namesPos
[1] "DST" "FT1" "FT2" "FT3" "NDW"
<-names(table(df$Org))) (namesOrg
[1] "ACTG" "AFRO" "AMST" "ANTH" "APHS" "AST" "BEPP" "BFIN" "BI"
[10] "BLAN" "CEDP" "CEUS" "CHEM" "CMCL" "CMLT" "CRIN" "CSCI" "EALC"
[19] "ECON" "ELPS" "ENG" "FINH" "FINS" "FOLK" "FRIT" "GEOG" "GEOL"
[28] "GERM" "HIST" "INFO" "JOUR" "KINE" "LAWS" "LGED" "LING" "MATH"
[37] "MGMT" "MKTG" "MUHI" "MUIN" "MUST" "MUTH" "MUVO" "OADT" "OPT"
[46] "PHYS" "PL" "POLS" "PSY" "REL" "RPAD" "SLIS" "SLS" "SOC"
[55] "SPAN" "SPEA" "SPHS" "STAT" "TELC" "THTR"
Interactions names:
#use outer() to create names for interactions.
as.vector(outer(1:4,1:2,paste,sep="-"))
[1] "1-1" "2-1" "3-1" "4-1" "1-2" "2-2" "3-2" "4-2"
#apply to our case
<-as.vector(outer(namesOrg,namesPos,paste,sep="-"))) (namesInter
[1] "ACTG-DST" "AFRO-DST" "AMST-DST" "ANTH-DST" "APHS-DST"
[6] "AST-DST" "BEPP-DST" "BFIN-DST" "BI-DST" "BLAN-DST"
[11] "CEDP-DST" "CEUS-DST" "CHEM-DST" "CMCL-DST" "CMLT-DST"
[16] "CRIN-DST" "CSCI-DST" "EALC-DST" "ECON-DST" "ELPS-DST"
[21] "ENG-DST" "FINH-DST" "FINS-DST" "FOLK-DST" "FRIT-DST"
[26] "GEOG-DST" "GEOL-DST" "GERM-DST" "HIST-DST" "INFO-DST"
[31] "JOUR-DST" "KINE-DST" "LAWS-DST" "LGED-DST" "LING-DST"
[36] "MATH-DST" "MGMT-DST" "MKTG-DST" "MUHI-DST" "MUIN-DST"
[41] "MUST-DST" "MUTH-DST" "MUVO-DST" "OADT-DST" "OPT-DST"
[46] "PHYS-DST" "PL-DST" "POLS-DST" "PSY-DST" "REL-DST"
[51] "RPAD-DST" "SLIS-DST" "SLS-DST" "SOC-DST" "SPAN-DST"
[56] "SPEA-DST" "SPHS-DST" "STAT-DST" "TELC-DST" "THTR-DST"
[61] "ACTG-FT1" "AFRO-FT1" "AMST-FT1" "ANTH-FT1" "APHS-FT1"
[66] "AST-FT1" "BEPP-FT1" "BFIN-FT1" "BI-FT1" "BLAN-FT1"
[71] "CEDP-FT1" "CEUS-FT1" "CHEM-FT1" "CMCL-FT1" "CMLT-FT1"
[76] "CRIN-FT1" "CSCI-FT1" "EALC-FT1" "ECON-FT1" "ELPS-FT1"
[81] "ENG-FT1" "FINH-FT1" "FINS-FT1" "FOLK-FT1" "FRIT-FT1"
[86] "GEOG-FT1" "GEOL-FT1" "GERM-FT1" "HIST-FT1" "INFO-FT1"
[91] "JOUR-FT1" "KINE-FT1" "LAWS-FT1" "LGED-FT1" "LING-FT1"
[96] "MATH-FT1" "MGMT-FT1" "MKTG-FT1" "MUHI-FT1" "MUIN-FT1"
[101] "MUST-FT1" "MUTH-FT1" "MUVO-FT1" "OADT-FT1" "OPT-FT1"
[106] "PHYS-FT1" "PL-FT1" "POLS-FT1" "PSY-FT1" "REL-FT1"
[111] "RPAD-FT1" "SLIS-FT1" "SLS-FT1" "SOC-FT1" "SPAN-FT1"
[116] "SPEA-FT1" "SPHS-FT1" "STAT-FT1" "TELC-FT1" "THTR-FT1"
[121] "ACTG-FT2" "AFRO-FT2" "AMST-FT2" "ANTH-FT2" "APHS-FT2"
[126] "AST-FT2" "BEPP-FT2" "BFIN-FT2" "BI-FT2" "BLAN-FT2"
[131] "CEDP-FT2" "CEUS-FT2" "CHEM-FT2" "CMCL-FT2" "CMLT-FT2"
[136] "CRIN-FT2" "CSCI-FT2" "EALC-FT2" "ECON-FT2" "ELPS-FT2"
[141] "ENG-FT2" "FINH-FT2" "FINS-FT2" "FOLK-FT2" "FRIT-FT2"
[146] "GEOG-FT2" "GEOL-FT2" "GERM-FT2" "HIST-FT2" "INFO-FT2"
[151] "JOUR-FT2" "KINE-FT2" "LAWS-FT2" "LGED-FT2" "LING-FT2"
[156] "MATH-FT2" "MGMT-FT2" "MKTG-FT2" "MUHI-FT2" "MUIN-FT2"
[161] "MUST-FT2" "MUTH-FT2" "MUVO-FT2" "OADT-FT2" "OPT-FT2"
[166] "PHYS-FT2" "PL-FT2" "POLS-FT2" "PSY-FT2" "REL-FT2"
[171] "RPAD-FT2" "SLIS-FT2" "SLS-FT2" "SOC-FT2" "SPAN-FT2"
[176] "SPEA-FT2" "SPHS-FT2" "STAT-FT2" "TELC-FT2" "THTR-FT2"
[181] "ACTG-FT3" "AFRO-FT3" "AMST-FT3" "ANTH-FT3" "APHS-FT3"
[186] "AST-FT3" "BEPP-FT3" "BFIN-FT3" "BI-FT3" "BLAN-FT3"
[191] "CEDP-FT3" "CEUS-FT3" "CHEM-FT3" "CMCL-FT3" "CMLT-FT3"
[196] "CRIN-FT3" "CSCI-FT3" "EALC-FT3" "ECON-FT3" "ELPS-FT3"
[201] "ENG-FT3" "FINH-FT3" "FINS-FT3" "FOLK-FT3" "FRIT-FT3"
[206] "GEOG-FT3" "GEOL-FT3" "GERM-FT3" "HIST-FT3" "INFO-FT3"
[211] "JOUR-FT3" "KINE-FT3" "LAWS-FT3" "LGED-FT3" "LING-FT3"
[216] "MATH-FT3" "MGMT-FT3" "MKTG-FT3" "MUHI-FT3" "MUIN-FT3"
[221] "MUST-FT3" "MUTH-FT3" "MUVO-FT3" "OADT-FT3" "OPT-FT3"
[226] "PHYS-FT3" "PL-FT3" "POLS-FT3" "PSY-FT3" "REL-FT3"
[231] "RPAD-FT3" "SLIS-FT3" "SLS-FT3" "SOC-FT3" "SPAN-FT3"
[236] "SPEA-FT3" "SPHS-FT3" "STAT-FT3" "TELC-FT3" "THTR-FT3"
[241] "ACTG-NDW" "AFRO-NDW" "AMST-NDW" "ANTH-NDW" "APHS-NDW"
[246] "AST-NDW" "BEPP-NDW" "BFIN-NDW" "BI-NDW" "BLAN-NDW"
[251] "CEDP-NDW" "CEUS-NDW" "CHEM-NDW" "CMCL-NDW" "CMLT-NDW"
[256] "CRIN-NDW" "CSCI-NDW" "EALC-NDW" "ECON-NDW" "ELPS-NDW"
[261] "ENG-NDW" "FINH-NDW" "FINS-NDW" "FOLK-NDW" "FRIT-NDW"
[266] "GEOG-NDW" "GEOL-NDW" "GERM-NDW" "HIST-NDW" "INFO-NDW"
[271] "JOUR-NDW" "KINE-NDW" "LAWS-NDW" "LGED-NDW" "LING-NDW"
[276] "MATH-NDW" "MGMT-NDW" "MKTG-NDW" "MUHI-NDW" "MUIN-NDW"
[281] "MUST-NDW" "MUTH-NDW" "MUVO-NDW" "OADT-NDW" "OPT-NDW"
[286] "PHYS-NDW" "PL-NDW" "POLS-NDW" "PSY-NDW" "REL-NDW"
[291] "RPAD-NDW" "SLIS-NDW" "SLS-NDW" "SOC-NDW" "SPAN-NDW"
[296] "SPEA-NDW" "SPHS-NDW" "STAT-NDW" "TELC-NDW" "THTR-NDW"
#all names:
<-c("Intercept", namesPos, namesOrg, namesInter, rep("Var",5)) #why need to rep Var 5 times varNames
# fit model
<- sampling(stanDsoANOVA2Way,
fit data=dataListSalary,
pars=c('b0',
'b1',
'b2',
'b1b2',
'b1Sigma',
'b2Sigma',
'b1b2Sigma',
'ySigma'),
iter=5000, chains = 2, cores = 4)
# save(fit, file = "data/fitinstan1.Rdata")
load("data/fitinstan1.Rdata")
launch_shinystan(fit)
Create results including mean value, 2.5%, 50% and 97.5% quantiles. Add variable names as row names.
<- summary(fit)$summary[,c(1,4,6,8)]
SalaryResults nrow(SalaryResults)-(4:0)] <- rownames(SalaryResults)[nrow(SalaryResults)-(4:0)]
varNames[rownames(SalaryResults) <- varNames
head(SalaryResults)
mean 2.5% 50% 97.5%
Intercept 127090.073 124721.828 127081.992 129373.0577
DST 55548.126 48387.941 55571.959 62569.1969
FT1 -3101.583 -5949.631 -3126.253 -148.3999
FT2 -33047.754 -35838.531 -33046.367 -30203.8486
FT3 -46367.618 -49353.221 -46385.131 -43252.7522
NDW 26968.829 22202.907 26981.535 31482.3894
Make plots of mean values and HDIs.
plot(fit,pars=c("b1"))
plot(fit,pars=c('b2'))
plot(fit,pars=c("b1b2"))
Plots show that not all coefficients of the model equal zero. This answers the question of utility test.
Extract chains for the position variables.
<- rstan::extract(fit)
fit_ext names(fit_ext)
[1] "b0" "b1" "b2" "b1b2" "b1Sigma"
[6] "b2Sigma" "b1b2Sigma" "ySigma" "lp__"
<-fit_ext$b0
fit_ext.b0<-fit_ext$b1
fit_ext.b1colnames(fit_ext.b1) <- namesPos
head(fit_ext.b1)
iterations DST FT1 FT2 FT3 NDW
[1,] 53643.51 -2464.063 -32308.67 -48308.55 29437.76
[2,] 58738.28 -1706.306 -33827.16 -46158.53 22953.73
[3,] 55505.48 -3764.629 -31670.52 -44799.95 24729.62
[4,] 53497.58 -4276.004 -31554.91 -48626.05 30959.38
[5,] 57104.01 -2315.971 -34761.95 -48452.81 28426.72
[6,] 57498.84 -2508.874 -33852.21 -47274.01 26136.25
Extract chains for the department variables.
<- fit_ext$b2
fit_ext.b2 colnames(fit_ext.b2) <- namesOrg
head(fit_ext.b2)
iterations ACTG AFRO AMST ANTH APHS
[1,] 84518.72 -10683.803 -31817.1077 -4720.261 3396.7104
[2,] 88156.40 -12387.445 836.3933 -20495.377 -11482.3619
[3,] 81611.69 -13618.307 -28257.8104 -18097.996 11482.2086
[4,] 79194.53 -21153.965 -15077.6255 -21260.863 -125.4459
[5,] 85622.17 -25296.108 -18530.4846 -20764.400 564.3309
[6,] 82473.44 -3693.756 -14717.9823 -22886.365 6663.3884
iterations AST BEPP BFIN BI BLAN CEDP
[1,] -272.2711 56966.20 113408.94 -6397.807 24797.14 -23240.80
[2,] 8032.0282 50815.51 108789.31 -2003.753 23891.55 -22317.59
[3,] -4209.3496 54267.00 105146.37 -9549.098 14526.17 -11514.01
[4,] 8876.8401 30764.92 99360.17 2224.069 15125.27 -20714.19
[5,] 8044.0916 46878.48 118933.83 -2619.729 10774.34 -13026.26
[6,] 7926.3391 46454.59 109490.48 -2284.006 24635.29 -21256.39
iterations CEUS CHEM CMCL CMLT CRIN
[1,] -21339.84 21278.46 -16396.336 -13661.83 -28726.73
[2,] -18465.13 14471.89 2961.902 -37086.56 -29609.02
[3,] -19574.64 23867.96 -14992.093 -22798.07 -20194.33
[4,] -25334.82 21066.61 -12840.202 -21092.47 -17214.27
[5,] -20353.06 20205.76 -18418.019 -29499.71 -23319.70
[6,] -19667.90 14114.09 -6386.052 -16853.90 -14670.81
iterations CSCI EALC ECON ELPS ENG
[1,] 19070.861 -22627.064 53690.70 -2256.9313 -23618.76
[2,] 6341.859 -16326.435 63001.24 -6692.3318 -24588.41
[3,] 5777.751 -24292.943 60270.28 -11446.9859 -22431.50
[4,] 19402.512 -25074.484 56716.46 -865.0668 -21256.75
[5,] 13926.101 -11248.245 45142.49 -5962.1880 -19477.31
[6,] 14513.160 -3620.939 57299.77 -7540.6420 -12163.67
iterations FINH FINS FOLK FRIT GEOG
[1,] -14346.30 -18209.02 -29875.32 -21655.118 -5639.289
[2,] -15715.47 -20946.13 -23147.34 -23395.993 -16202.811
[3,] -26414.60 -22185.23 -17332.30 -14807.023 -18898.645
[4,] -13383.51 -21330.12 -15269.66 -16003.793 -9272.097
[5,] -23487.76 -14402.51 -20000.94 -15901.330 -5976.184
[6,] -16291.29 -24708.77 -21488.15 -2759.774 -15596.673
iterations GEOL GERM HIST INFO JOUR
[1,] -22550.037 -23929.79 -11922.908 14948.452 -24063.132
[2,] -7012.096 -51561.03 -9361.597 6416.838 -26205.125
[3,] -10887.379 -32162.35 -9143.212 4342.127 -9072.310
[4,] -16233.336 -23952.16 -9301.820 15654.625 -23874.915
[5,] -21942.573 -35140.01 -15107.426 15316.580 -15393.027
[6,] -4562.365 -30880.93 -16131.613 7988.985 -9296.838
iterations KINE LAWS LGED LING MATH
[1,] -9193.225 38785.02 -17528.29 -17190.155 -8225.30392
[2,] -6282.942 47666.25 -17899.19 -13565.903 -7706.18824
[3,] -9910.339 45848.64 -27964.54 -6617.547 -1768.38643
[4,] -11922.797 33108.61 -18500.86 2572.191 -3658.10736
[5,] -2202.068 39704.44 -14898.99 -15872.538 -8693.52608
[6,] -5544.008 38161.16 -33744.68 -9544.743 -85.51853
iterations MGMT MKTG MUHI MUIN MUST
[1,] 62113.34 64268.19 -32968.88 -20481.139 -844.2042
[2,] 62136.44 67642.69 -27573.69 -18587.600 -6443.1771
[3,] 74127.78 71343.24 -34818.65 -5149.272 -998.3260
[4,] 59756.08 62986.43 -24386.81 -27655.967 7488.0542
[5,] 63347.38 58431.11 -27734.24 -23368.999 7810.1252
[6,] 47017.26 69201.49 -35955.53 -19428.976 -6371.6317
iterations MUTH MUVO OADT OPT PHYS
[1,] -27687.26 -29106.00 53817.90 -13708.67 938.1202
[2,] -37874.35 -31662.78 57211.89 -17478.04 -5738.0122
[3,] -28434.64 -33955.67 60768.24 -4218.98 3772.0028
[4,] -41578.53 -26331.99 52976.79 -21015.56 6144.4542
[5,] -21352.16 -24057.30 55091.18 -14208.71 -3552.6289
[6,] -35308.73 -39279.62 54524.81 -13330.94 -2781.7758
iterations PL POLS PSY REL RPAD
[1,] -11753.736 3513.843 7017.447 -6453.305 -3187.879
[2,] -8796.469 15546.737 8450.865 -14424.987 -12159.452
[3,] -9792.748 5648.549 9457.655 -12918.854 -4778.571
[4,] 3873.790 7563.929 3148.918 -14619.568 -6131.743
[5,] -19536.730 10201.619 10906.627 -1860.096 -2935.141
[6,] -2805.521 4428.878 9198.403 -12073.818 -1760.722
iterations SLIS SLS SOC SPAN SPEA
[1,] 12208.8805 -17676.380 -14610.381 -26932.70 12609.29
[2,] 3209.0633 -152.842 -10660.845 -11541.33 21244.00
[3,] 6337.0940 -22995.180 -3894.261 -28734.78 13156.72
[4,] 6231.9672 -17295.152 -5021.769 -16128.08 15097.28
[5,] 849.9127 -21943.215 -5528.208 -10620.69 12476.60
[6,] 8901.4092 -26578.505 -5382.470 -21548.77 16643.46
iterations SPHS STAT TELC THTR
[1,] 12925.959 10388.95421 -2909.144 -32256.04
[2,] 8928.522 6829.27323 -17467.685 -11563.16
[3,] 2515.913 963.79821 -8621.868 -27778.40
[4,] 2017.494 -229.95943 -7111.211 -19132.31
[5,] 2956.340 79.86284 -12122.435 -20908.74
[6,] 8378.987 -8767.91555 -27142.360 -33120.35
Extract chains for interaction variables.
<-fit_ext$b1b2
fit_ext.b1.b2dim(fit_ext.b1.b2)
[1] 5000 5 60
Interaction chains make a cube.
dimnames(fit_ext.b1.b2)[[2]]<-namesPos
dimnames(fit_ext.b1.b2)[[3]]<-namesOrg
dimnames(fit_ext.b1.b2)
$iterations
NULL
[[2]]
[1] "DST" "FT1" "FT2" "FT3" "NDW"
[[3]]
[1] "ACTG" "AFRO" "AMST" "ANTH" "APHS" "AST" "BEPP" "BFIN" "BI"
[10] "BLAN" "CEDP" "CEUS" "CHEM" "CMCL" "CMLT" "CRIN" "CSCI" "EALC"
[19] "ECON" "ELPS" "ENG" "FINH" "FINS" "FOLK" "FRIT" "GEOG" "GEOL"
[28] "GERM" "HIST" "INFO" "JOUR" "KINE" "LAWS" "LGED" "LING" "MATH"
[37] "MGMT" "MKTG" "MUHI" "MUIN" "MUST" "MUTH" "MUVO" "OADT" "OPT"
[46] "PHYS" "PL" "POLS" "PSY" "REL" "RPAD" "SLIS" "SLS" "SOC"
[55] "SPAN" "SPEA" "SPHS" "STAT" "TELC" "THTR"
1,,] fit_ext.b1.b2[
ACTG AFRO AMST ANTH APHS AST
DST 3539.479 6393.899 1304.058 15763.611 -10799.0416 7927.158
FT1 -16326.820 -7421.503 -6828.542 -9787.681 11662.4623 -2611.558
FT2 9626.651 -4519.507 3486.851 -9088.313 -2036.0353 2425.408
FT3 8239.038 -2385.126 10078.649 3483.160 -975.3569 -13603.240
NDW -5078.347 7932.237 -8041.017 -370.778 2147.9715 5862.232
BEPP BFIN BI BLAN CEDP CEUS
DST 8728.384 -2330.6655 -2834.672 -5466.1952 -15580.544 -2080.820
FT1 -3538.370 976.2805 4358.263 -2541.3345 -1900.758 1838.315
FT2 4630.186 12267.9786 -5681.101 -673.3443 4958.960 4207.809
FT3 -2312.500 1773.3809 12940.011 1518.5936 9637.629 5874.503
NDW -7507.699 -12686.9744 -8782.501 7162.2803 2884.713 -9839.807
CHEM CMCL CMLT CRIN CSCI
DST 3002.086 -9440.742 -86.93006 -9140.478 3129.8086
FT1 11524.040 14411.093 -2557.61581 -3161.343 -2309.5690
FT2 -5293.531 -2668.087 -16853.21654 6077.927 -7840.4627
FT3 -22330.841 -7414.901 7430.17731 10001.465 422.3283
NDW 13098.245 5112.637 12067.58510 -3777.572 6597.8948
EALC ECON ELPS ENG FINH
DST -4167.890 -258.8566 -721.5942 -692.6053 -4837.1791
FT1 7427.198 5254.3375 3062.8163 2486.2253 -9159.9057
FT2 6582.357 12117.9070 -4679.8036 5590.4990 17244.9910
FT3 -3181.338 -22080.7613 -2932.5083 4007.1760 -2496.7910
NDW -6660.328 4967.3734 5271.0899 -11391.2951 -751.1152
FINS FOLK FRIT GEOG GEOL
DST 325.8752 -3893.956 -6294.419 6675.0396 5774.6693
FT1 535.5009 -7072.318 17389.175 7479.6932 12352.3715
FT2 -328.2188 4064.460 -15657.924 -6928.5509 2648.6372
FT3 -7741.0741 9117.089 6343.980 -701.5859 466.6109
NDW 7207.9169 -2215.275 -1780.812 -6524.5961 -21242.2889
GERM HIST INFO JOUR KINE
DST 4706.445 -8527.565 3237.7335 -1757.081 -10582.306
FT1 -2672.307 -2134.152 4940.7638 -3147.530 14339.188
FT2 -5330.064 -3001.392 -12774.4001 4272.784 -3333.583
FT3 9323.907 -5018.605 5237.0730 9447.929 7535.690
NDW -6027.980 18681.715 -641.1703 -8816.101 -7958.989
LAWS LGED LING MATH MGMT
DST -10015.5598 4591.43556 -7204.124 -11581.540 -3716.644
FT1 3412.6992 -587.59280 5405.700 -9752.037 -1804.589
FT2 6352.9234 -399.63708 -14028.469 4526.424 7716.997
FT3 -899.5615 -10.67023 -4023.939 12426.506 -5186.786
NDW 1149.4988 -3593.53544 19850.832 4380.648 2991.022
MKTG MUHI MUIN MUST MUTH
DST 12502.3899 -7724.675 3129.524 7886.1204 -9039.427
FT1 -16194.6842 3408.667 -4813.088 5061.4779 12022.885
FT2 -2920.6533 -3107.146 3336.682 -2460.8449 -1193.335
FT3 768.3334 6316.702 9569.930 -10862.8928 -6529.798
NDW 5844.6142 1106.452 -11223.048 376.1394 4739.675
MUVO OADT OPT PHYS PL POLS
DST -9946.788 -888.6629 1872.653 -3744.574 3435.746 -6711.089
FT1 8902.435 -10644.0479 -12866.850 5992.883 1352.037 1818.654
FT2 -7975.997 17648.2569 5249.675 14873.035 -9887.859 -4523.008
FT3 12665.367 3657.3418 -3987.251 -13554.341 12865.421 -2919.634
NDW -3645.017 -9772.8880 9731.773 -3567.002 -7765.346 12335.077
PSY REL RPAD SLIS SLS
DST 11700.3240 -6837.7256 9313.103 15054.896 8808.297
FT1 -15963.5566 -688.5178 -10034.632 9249.393 -3786.320
FT2 -12322.9888 -4672.9437 7567.516 -11362.755 -6045.782
FT3 525.9758 -7990.0302 -13929.224 -9658.127 1906.952
NDW 16060.2456 20189.2173 7083.237 -3283.406 -883.147
SOC SPAN SPEA SPHS STAT
DST -6806.699 1782.323470 12803.000 13314.603 2412.6612
FT1 -5621.186 1570.915735 4660.468 -6725.916 1597.2451
FT2 2436.551 9372.847478 2481.490 -6071.677 755.1363
FT3 1224.220 4.001735 1103.587 -10906.537 -4197.6978
NDW 8767.113 -12730.088418 -21048.545 10389.527 -567.3448
TELC THTR
DST -5399.985 9995.711
FT1 -7969.306 6130.450
FT2 3291.536 7848.155
FT3 -2627.908 4546.298
NDW 12705.663 -28520.613
Rows correspond to iterations of MCMC.
For each iteration there is interactions table between schools and
positions.
Salary of a school across positions is base level beta, plus school
beta: \(\beta_0+\beta_{school}\).
Salary of a position across schools is base level beta, plus position
beta: \(\beta_0+\beta_{position}\).
Salary of a certain position of a certain school is predicted as base
level beta, plus school beta, plus position beta, plus the interaction
beta: \(\beta_0+\beta_{school}+\beta_{position}+\beta_{school
\ \times \ position}\).
To answer each of the following 4 questions do the following:
hdi()
from library HDInterval
Use contrasts to compare salaries at Business and Finance (“BFIN”) with Physics (“PHYS”) and with Chemistry (“CHEM”) departments.
To do that select columns of MCMC for departments to “BFIN” and “PHYS” and “BFIN” and “CHEM”, take their differences and look at the posterior distribution of the differences
<- fit_ext.b2[,"BFIN"]-fit_ext.b2[,"PHYS"]
contrast.BFIN.PHYS hist(contrast.BFIN.PHYS)
mean(contrast.BFIN.PHYS)
[1] 111653.5
hdi(contrast.BFIN.PHYS)
lower upper
99528.41 124555.65
attr(,"credMass")
[1] 0.95
What do we conclude about the differences of salaries?
Plot histograms of salaries for both departments. Salaries
BFINComp
of Business and Finance and PHYSComp
of Physics are calculate as described above.
<- fit_ext.b2[,"BFIN"]
BFINComp <- fit_ext.b2[,"PHYS"] PHYSComp
<- density(BFINComp)
distrBFIN <- density(PHYSComp)
distrPHYS
plot(distrPHYS,xlim=c(-20000,200000), lwd=2, main="Salaries Distributions", col="blue")
lines(distrBFIN$x, distrPHYS$y, lwd=2, col="orange")
legend("top", legend = c("PHYS","BFIN"), col = c("blue","orange"), lty=1, lwd=2)
Do the same comparison for finance professors and chemistry professors.
<-fit_ext.b2[,"BFIN"]-fit_ext.b2[,"CHEM"]
contrast.BFIN.CHEMhist(contrast.BFIN.CHEM)
mean(contrast.BFIN.CHEM)
[1] 89933.55
hdi(contrast.BFIN.CHEM)
lower upper
78655.11 101160.71
attr(,"credMass")
[1] 0.95
<- fit_ext.b2[,"BFIN"]
BFINComp <- fit_ext.b2[,"CHEM"] CHEMComp
<- density(BFINComp)
distrBFIN <- density(PHYSComp)
distrPHYS
plot(distrPHYS,xlim=c(-20000,200000), lwd=2, main="Salaries Distributions", col="blue")
lines(distrBFIN$x, distrPHYS$y, lwd=2, col="orange")
legend("top", legend = c("PHYS","BFIN"), col = c("blue","orange"), lty=1, lwd=2)
Use contrasts to compare salaries of Endowment Full Professor (“NDW”) and Distinguished Full Professor (“DST”). Compare salaries of Full Professor (“FT1”) and Endowment Full Professor
<- fit_ext.b1[,"NDW"] - fit_ext.b1[,"DST"]
contrast.NDW.DST hist(contrast.NDW.DST)
mean(contrast.NDW.DST)
[1] -28579.3
hdi(contrast.NDW.DST)
lower upper
-38342.94 -17938.62
attr(,"credMass")
[1] 0.95
<- fit_ext.b1[,"NDW"]
DSTComp <- fit_ext.b1[,"DST"]
NDWComp
<-density(DSTComp)
distrDST<-density(NDWComp)
distrNDWplot(distrDST,xlim=c(14000,80000),lwd=2,main="Salaries Distributions",col="blue")
lines(distrNDW$x,distrPHYS$y,lwd=2,col="orange")
legend("top",legend=c("DST","NDW"),col=c("blue","orange"),lty=1,lwd=2)
Salary of a Distinguished Full Professor is significantly higher than salary of Endowment Full Professor.
Analyze difference between salaries of Full Professor (“FT1”) and Endowment Full Professor
<-fit_ext.b1[,"NDW"]-fit_ext.b1[,"FT1"]
contrast.NDW.FT1hist(contrast.NDW.FT1)
mean(contrast.NDW.FT1)
[1] 30070.41
hdi(contrast.NDW.FT1)
lower upper
23860.41 35549.30
attr(,"credMass")
[1] 0.95
Use contrasts to compare salaries spreads between Full Professor and Assistant Professor at Physics Department and at Chemistry Department.
<-fit_ext.b1[,"FT1"]+fit_ext.b1.b2[,"FT1","PHYS"]-
spreadPhys"FT3"]+fit_ext.b1.b2[,"FT3","PHYS"])
(fit_ext.b1[,<-fit_ext.b1[,"FT1"]+fit_ext.b1.b2[,"FT1","CHEM"]-
spreadChem"FT3"]+fit_ext.b1.b2[,"FT3","CHEM"])
(fit_ext.b1[,<-spreadPhys-spreadChem
spreadhist(spread)
mean(spread)
[1] -15530.47
hdi(spread)
lower upper
-35565.326 3157.709
attr(,"credMass")
[1] 0.95
Find the highest HDI level for which the spread of differences between “FT1” and “FT3” is significant.
hdi(spread,.88)
lower upper
-30176.5124 -661.9843
attr(,"credMass")
[1] 0.88
<-seq(from=.8,to=.95,by=.01)
tryLevels<-sapply(tryLevels,function(z) hdi(spread,z))
hdis<-tail(tryLevels[hdis[2,]<0],1)
maxLevelhdi(spread,maxLevel)
lower upper
-30176.5124 -661.9843
attr(,"credMass")
[1] 0.88
Nonlinear transformations may affect interactions very significantly.
Illustrate it on a simple simulated example.
<-1
mean00<-3
mean10<-4
mean01<-6
mean11<-rnorm(5,mean00,.1)
y00<-rnorm(5,mean10,.1)
y10<-rnorm(5,mean01,.1)
y01<-rnorm(5,mean11,.1) y11
Plot the effects. If the lines are parallel the effects are additive.
plot(c(0,1),c(mean(y00),mean(y10)),type="b",ylim=c(1,8),col="darkgreen",lwd=3,ylab="Response",xlab="Predictor 1")
lines(c(0,1),c(mean(y01),mean(y11)),type="b",col="lightblue",lwd=3)
legend("topleft",legend=c("Predictor2 at 0","Predictor2 at 1"),lty=1,lwd=3,col=c("darkgreen","lightblue"))
Taking exponent of the same data introduces significant interaction.
plot(c(0,1),c(mean(exp(y00)),mean(exp(y10))),type="b",ylim=c(1,400),col="darkgreen",lwd=3,ylab="Response",xlab="Predictor 1")
lines(c(0,1),c(mean(exp(y01)),mean(exp(y11))),type="b",col="lightblue",lwd=3)
legend("topleft",legend=c("Predictor2 at 0","Predictor2 at 1"),lty=1,lwd=3,col=c("darkgreen","lightblue"))
We cannot use aov
in ANCOVA because it uses Type I SS,
instead we should use Type II SS to get the correct results.
<- aov(Longevity ~ CompanionNumber + Thorax, data = dta)
longevity.aov2 # car Anova command on our longevity.aov2 object,
#summary(longevity.aov2) #produces incorrect results
Anova(longevity.aov2, type="II")
Anova Table (Type II tests)
Response: Longevity
Sum Sq Df F value Pr(>F)
CompanionNumber 9611.5 4 21.753 1.719e-13 ***
Thorax 13168.9 1 119.219 < 2.2e-16 ***
Residuals 13144.7 119
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The covariate, Thorax
, was significantly related to the
fly’s longevity, F(1,119)=119.22, p<.001. There was also a
significant effect of the Companion Number on the Longevity after
controlling for the effect of the Thorax, F(4,119)=21.753,
p<.001.
Data show significant variance within groups.
In this example the subjects of the experiment were in different physical conditions which contributed to large variation of life expectancy.
A metric predictor that is available is the size of the species.
Create data with both predictors.
<-list(Ntotal=nrow(dta),
dataList2y=dta$Longevity,
xMet=dta$Thorax,
xNom=as.integer(as.factor(dta$CompanionNumber)),
NxLvl=nlevels(as.factor(dta$CompanionNumber)),
agammaShRa=unlist(gammaShRaFromModeSD(mode=sd(dta$Longevity)/2,
sd=2*sd(dta$Longevity))))
<-"
modelStringdata {
int<lower=1> Ntotal;
real y[Ntotal];
int<lower=2> NxLvl;
int<lower=1, upper=NxLvl> xNom[Ntotal];
real xMet[Ntotal];
real<lower=0> agammaShRa[2];
}
transformed data {
real meanY;
real sdY;
real xMetMean;
real xMetSD;
meanY = mean(y);
sdY = sd(y);
xMetMean = mean(xMet);
xMetSD = sd(xMet);
}
parameters {
real a0;
real<lower=0> aSigma;
vector[NxLvl] a;
real aMet;
real<lower=0> ySigma;
}
model {
a0 ~ normal(meanY, 5*sdY);
aSigma ~ gamma(agammaShRa[1], agammaShRa[2]);
a ~ normal(0, aSigma);
aMet ~ normal(0, 2*sdY/xMetSD);
ySigma ~ uniform(sdY/100, sdY*10);
for ( i in 1:Ntotal ) {
y[i] ~ normal(a0 + a[xNom[i]] + aMet*(xMet[i] - xMetMean), ySigma);
}
}
generated quantities {
// Convert a0,a[] to sum-to-zero b0,b[] :
real b0;
vector[NxLvl] b;
b0 = a0 + mean(a) - aMet * xMetMean;
b = a - mean(a);
}
"
<- stan_model(model_code=modelString) model2
If saved DSO is used load it, then run the chains.
# saveRDS(model2, file="data/stanDSO2.Rds")
<- readRDS("data/stanDSO2.Rds") model2
Run MCMC.
<- sampling(model2,
fit2 data=dataList2,
pars=c('b0', 'b', 'aMet', 'aSigma', 'ySigma'),
iter=5000, chains = 2, cores = 4)
launch_shinystan(fit2)
Calculate same contrasts for betas:
<- rstan::extract(fit2)
fit_ext2 head(fit_ext2$b)
iterations [,1] [,2] [,3] [,4] [,5]
[1,] 4.466494 6.207247 3.940179 -0.8029348 -13.81098
[2,] 7.582592 4.825399 6.512522 -5.6358837 -13.28463
[3,] 6.948511 7.865498 6.695544 -6.7136045 -14.79595
[4,] 4.922978 7.802200 6.146581 -5.6397349 -13.23202
[5,] 7.228799 5.271264 6.506355 -5.4617319 -13.54469
[6,] 3.844551 8.005606 6.024163 -5.0992662 -12.77505
<- fit_ext2$b[,1] - fit_ext2$b[,5]
contrast2_1_5 plot(contrast2_1_5)
hist(contrast2_1_5)
<-hdi(contrast2_1_5)) (hdiContrast2_1_5
lower upper
13.40625 25.09579
attr(,"credMass")
[1] 0.95
<-sd(contrast2_1_5)) (sd.contrast2_1_5
[1] 2.995819
plot(rank(fit_ext2$b[,1]),rank(fit_ext2$b[,5]))
<-(fit_ext2$b[,1] + fit_ext2$b[,2] + fit_ext2$b[,3])/3
Comb1<-(fit_ext2$b[,4] + fit_ext2$b[,5])/2
Comb2<-Comb1-Comb2
contrast2_123_45 plot(contrast2_123_45)
hist(contrast2_123_45)
<-hdi(contrast2_123_45)) (hdiContrast2_123_45
lower upper
11.28173 18.74789
attr(,"credMass")
[1] 0.95
<-sd(contrast2_123_45)) (sd.contrast2_123_45
[1] 1.910106
plot(rank(Comb1),rank(Comb2))
Compare the models.
Using standard deviation:
rbind(WithoutMetricPred=c(sd.contrast_1_5,sd.contrast_123_45),WithMetricPred=c(sd.contrast2_1_5,sd.contrast2_123_45))
[,1] [,2]
WithoutMetricPred 4.256989 2.734301
WithMetricPred 2.995819 1.910106
rbind(WithoutMetricPred=c(hdiContrast_1_5,hdiContrast_123_45),WithMetricPred=c(hdiContrast2_1_5,hdiContrast2_123_45))
lower upper lower upper
WithoutMetricPred 15.21982 31.95605 9.713508 20.41661
WithMetricPred 13.40625 25.09579 11.281726 18.74789
Using HDI width:
rbind(WithoutMetricPred=c(Contrast_1_5=unname(diff(hdiContrast_1_5)),
Contrast_123_45=unname(diff(hdiContrast_123_45))),
WithMetricPred=c(Contrast_1_5=diff(hdiContrast2_1_5),
Contrast_123_45=diff(hdiContrast2_123_45)))
Contrast_1_5 Contrast_123_45
WithoutMetricPred 16.73622 10.703103
WithMetricPred 11.68954 7.466168
Obviously, the width was shrinkage so with the metric predictor in the model, we can explain more the variability of Longevity due to the difference among groups and Thorax.
The most restrictive assumption of ANOVA is equivalence of variances in all groups.
Look how Bayesian approach can remove this assumption.
On the following model diagram there is an implementation of model with heterogeneous variances.
The model uses t-distributed noise and allows each group to have its own standard deviation.
Data set InsectSprays
from datasets
shows
test of 6 different insect sprays. Column count contains numbers of
insects found in the field after each spraying. Column spray identifies
the type of spray.
Prepare the data.
<- InsectSprays
df head(df)
count spray
1 10 A
2 7 A
3 20 A
4 14 A
5 14 A
6 12 A
levels(df$spray)
[1] "A" "B" "C" "D" "E" "F"
<- list(Ntotal = nrow(df),
dataListSprays y = df$count,
x = as.integer(df$spray),
NxLvl = nlevels(df$spray),
aGammaShRa = unlist(gammaShRaFromModeSD(mode=sd(df$count)/2,
sd=2*sd(df$count))))
Plot the data.
plot(df$count ~ df$spray)
Note that variances within the groups are very different. If estimated overall the variance for groups “C”, “D” and “E” is overestimated and for “A” “B” and “F” - underestimated.
<- lm(count~spray, df)
m1 summary(m1)
Call:
lm(formula = count ~ spray, data = df)
Residuals:
Min 1Q Median 3Q Max
-8.333 -1.958 -0.500 1.667 9.333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 14.5000 1.1322 12.807 < 2e-16 ***
sprayB 0.8333 1.6011 0.520 0.604
sprayC -12.4167 1.6011 -7.755 7.27e-11 ***
sprayD -9.5833 1.6011 -5.985 9.82e-08 ***
sprayE -11.0000 1.6011 -6.870 2.75e-09 ***
sprayF 2.1667 1.6011 1.353 0.181
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.922 on 66 degrees of freedom
Multiple R-squared: 0.7244, Adjusted R-squared: 0.7036
F-statistic: 34.7 on 5 and 66 DF, p-value: < 2.2e-16
anova(m1)
Analysis of Variance Table
Response: count
Df Sum Sq Mean Sq F value Pr(>F)
spray 5 2668.8 533.77 34.702 < 2.2e-16 ***
Residuals 66 1015.2 15.38
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#m.aov <- aov(count~spray, df)
#summary(m.aov)
The categorical factor is significant, i.e. utility test fails.
Create contrasts to identify differences between combinations of sprays.
<- factor(df$spray)
A contrasts(A)
B C D E F
A 0 0 0 0 0
B 1 0 0 0 0
C 0 1 0 0 0
D 0 0 1 0 0
E 0 0 0 1 0
F 0 0 0 0 1
contrasts(A)<-cbind("GA vs GB"=c(1,-1,0,0,0,0),
"GAB vs GF"=c(1,1,0,0,0,-2),
"GC vs GD"=c(0,0,1,-1,0,0),
"GC vs GE"=c(0,0,1,0,-1,0),
"GABF vs GCDE"=c(1,1,-1,-1,-1,1))
A
[1] A A A A A A A A A A A A B B B B B B B B B B B B C C C C C C C C C
[34] C C C D D D D D D D D D D D D E E E E E E E E E E E E F F F F F F
[67] F F F F F F
attr(,"contrasts")
GA vs GB GAB vs GF GC vs GD GC vs GE GABF vs GCDE
A 1 1 0 0 1
B -1 1 0 0 1
C 0 0 1 1 -1
D 0 0 -1 0 -1
E 0 0 0 -1 -1
F 0 -2 0 0 1
Levels: A B C D E F
aov(df$count ~ A)
Call:
aov(formula = df$count ~ A)
Terms:
A Residuals
Sum of Squares 2668.833 1015.167
Deg. of Freedom 5 66
Residual standard error: 3.921902
Estimated effects may be unbalanced
summary.lm(aov(df$count ~ A))
Call:
aov(formula = df$count ~ A)
Residuals:
Min 1Q Median 3Q Max
-8.333 -1.958 -0.500 1.667 9.333
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 9.500e+00 4.622e-01 20.554 <2e-16 ***
AGA vs GB -4.167e-01 8.006e-01 -0.520 0.604
AGAB vs GF -5.833e-01 4.622e-01 -1.262 0.211
AGC vs GD -1.417e+00 9.244e-01 -1.533 0.130
AGC vs GE 8.374e-16 9.244e-01 0.000 1.000
AGABF vs GCDE 6.000e+00 4.622e-01 12.981 <2e-16 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 3.922 on 66 degrees of freedom
Multiple R-squared: 0.7244, Adjusted R-squared: 0.7036
F-statistic: 34.7 on 5 and 66 DF, p-value: < 2.2e-16
Summary shows that sprays A and B, C and D, C and E are indistinguishable as well as combined observations of sprays A and B vs. F, but combined observations for sprays A, B, F are significantly different from C, D and E.
However, separate t-test of C vs. D shows significant difference and equivalence of C and E is almost rejected by t-test.
t.test(df$count[df$spray=="C"], df$count[df$spray=="D"])
Welch Two Sample t-test
data: df$count[df$spray == "C"] and df$count[df$spray == "D"]
t = -3.0782, df = 20.872, p-value = 0.00573
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.7482230 -0.9184437
sample estimates:
mean of x mean of y
2.083333 4.916667
t.test(df$count[df$spray=="C"], df$count[df$spray=="E"])
Welch Two Sample t-test
data: df$count[df$spray == "C"] and df$count[df$spray == "E"]
t = -1.868, df = 21.631, p-value = 0.07536
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-2.9909888 0.1576555
sample estimates:
mean of x mean of y
2.083333 3.500000
These contradicting results are caused by the fact that for comparison of C with D and E the assumption of homoscedasticity leads to overestimation of noise in general framework of ANOVA, but pairwise t-test is only accounting for variances of the compared pairs.
Run MCMC.
<-"
modelStringdata {
int<lower=1> Ntotal;
real y[Ntotal];
int<lower=2> NxLvl;
int<lower=1, upper=NxLvl> x[Ntotal];
real<lower=0> aGammaShRa[2];
}
transformed data {
real meanY;
real sdY;
meanY = mean(y);
sdY = sd(y);
}
parameters {
real<lower=0> nu;
real a0;
real<lower=0> aSigma;
vector[NxLvl] a;
real<lower=0> ySigma[NxLvl];
real<lower=0> ySigmaMode;
real<lower=0> ySigmaSD;
}
transformed parameters{
real<lower=0> ySigmaSh;
real<lower=0> ySigmaRa;
ySigmaRa = ( ( ySigmaMode + sqrt( ySigmaMode^2 + 4*ySigmaSD^2 ) ) / ( 2*ySigmaSD^2 ) );
ySigmaSh = 1 + ySigmaMode * ySigmaRa;
}
model {
nu ~ exponential(1/30.0);
a0 ~ normal(meanY, 10*sdY);
aSigma ~ gamma(aGammaShRa[1], aGammaShRa[2]);
a ~ normal(0, aSigma);
ySigma ~ gamma(ySigmaSh, ySigmaRa);
ySigmaMode ~ gamma(aGammaShRa[1], aGammaShRa[2]);
ySigmaSD ~ gamma(aGammaShRa[1], aGammaShRa[2]);
for ( i in 1:Ntotal ) {
y[i] ~ student_t(nu, a0 + a[x[i]], ySigma[x[i]]);
}
}
generated quantities {
// Convert a0,a[] to sum-to-zero b0,b[] :
real b0;
vector[NxLvl] b;
b0 = a0 + mean(a);
b = a - mean(a);
}
"
<- stan_model(model_code = modelString) model3
# saveRDS(model3,file = "data/stanDSO3.Rds")
<- readRDS(file = "data/stanDSO3.Rds") model3
Run chains.
<- sampling(model3,
fit.sprays data=dataListSprays,
pars=c('b0', 'b', 'aSigma', 'ySigma', 'nu', 'ySigmaMode', 'ySigmaSD'),
iter=5000, chains = 2, cores = 4)
launch_shinystan(fit)
Analyze estimates.
plot(fit.sprays,pars=c("b0","aSigma"))
plot(fit.sprays,pars=c("b"))
plot(fit.sprays,pars=c("ySigma"))
Extract chains.
<- rstan::extract(fit.sprays)
chains_sprays head(chains_sprays$b)
iterations [,1] [,2] [,3] [,4] [,5] [,6]
[1,] 5.321336 5.903195 -7.231029 -5.126114 -5.243187 6.375798
[2,] 6.009771 4.809611 -7.284751 -4.387986 -6.658077 7.511432
[3,] 3.685636 5.496733 -6.662345 -3.278268 -5.640332 6.398575
[4,] 3.609592 4.813631 -6.971538 -2.848964 -6.461671 7.858950
[5,] 7.089465 4.230392 -7.824936 -5.013679 -6.613809 8.132567
[6,] 4.836717 6.068238 -6.645967 -4.104199 -5.881397 5.726608
Look at the same contrasts.
<- chains_sprays$b[,1] - chains_sprays$b[,2]
contrast1_2 plot(contrast1_2)
hist(contrast1_2)
<-hdi(contrast1_2)) (hdiContrast1_2
lower upper
-4.566850 3.072822
attr(,"credMass")
[1] 0.95
<-sd(contrast1_2)) (sd.contrast1_2
[1] 1.92444
plot(rank(chains_sprays$b[,1]),rank(chains_sprays$b[,5]))
The contrast is not different from zero.
<-chains_sprays$b[,1] + chains_sprays$b[,2]
Comb1<-2*chains_sprays$b[,6]
Comb2<-Comb1 - Comb2
contrast12_6 plot(contrast12_6)
hist(contrast12_6)
<-hdi(contrast12_6)) (hdiContrast12_6
lower upper
-11.15426 5.21297
attr(,"credMass")
[1] 0.95
<-sd(contrast12_6)) (sd.contrast12_6
[1] 4.148463
plot(rank(Comb1),rank(Comb2))
The contrast is not different from zero.
<- chains_sprays$b[,3] - chains_sprays$b[,4]
contrast3_4 plot(contrast3_4)
hist(contrast3_4)
<-hdi(contrast3_4)) (hdiContrast3_4
lower upper
-4.8309758 -0.7441633
attr(,"credMass")
[1] 0.95
<-sd(contrast3_4)) (sd.contrast3_4
[1] 1.01399
plot(rank(chains_sprays$b[,3]),rank(chains_sprays$b[,4]))
This contrast is different from zero. ANOVA could not detect that.
<- chains_sprays$b[,3] - chains_sprays$b[,5]
contrast3_5 plot(contrast3_5)
hist(contrast3_5)
<-hdi(contrast3_5)) (hdiContrast3_5
lower upper
-3.206783 0.386665
attr(,"credMass")
[1] 0.95
<-sd(contrast3_5)) (sd.contrast3_5
[1] 0.9066876
plot(rank(chains_sprays$b[,3]),rank(chains_sprays$b[,5]))
The contrast is not different from zero.
<-chains_sprays$b[,1] + chains_sprays$b[,2] + chains_sprays$b[,6]
Comb1<-chains_sprays$b[,3] + chains_sprays$b[,4] + chains_sprays$b[,5]
Comb2<-Comb1 - Comb2
contrast126_345 plot(contrast126_345)
hist(contrast126_345)
<-hdi(contrast126_345)) (hdiContrast126_345
lower upper
29.35368 40.63068
attr(,"credMass")
[1] 0.95
<-sd(contrast126_345)) (sd.contrast126_345
[1] 2.911281
plot(rank(Comb1),rank(Comb2))
The contrast is significantly different from zero.
Adapted from UC’s coursework
If you see mistakes or want to suggest changes, please create an issue on the source repository.
Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/hai-mn/hai-mn.github.io, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".
For attribution, please cite this work as
Nguyen (2022, Feb. 28). HaiBiostat: Series 6 -- ANOVA & ANCOVA. Retrieved from https://hai-mn.github.io/posts/2022-02-28-Bayesian methods - Series 6 of 10/
BibTeX citation
@misc{nguyen2022series, author = {Nguyen, Hai}, title = {HaiBiostat: Series 6 -- ANOVA & ANCOVA}, url = {https://hai-mn.github.io/posts/2022-02-28-Bayesian methods - Series 6 of 10/}, year = {2022} }